SUMMARY


We report genome sequence data from six individuals excavated from the base of a medieval well at a site in 
Norwich, UK. A revised radiocarbon analysis of the assemblage is consistent with these individuals being 
part of a historically attested episode of antisemitic violence on 6 February 1190 CE. We find that four of these 
individuals were closely related and all six have strong genetic affinities with modern Ashkenazi Jews. We 
identify four alleles associated with genetic disease in Ashkenazi Jewish populations and infer variation in 
pigmentation traits, including the presence of red hair. Simulations indicate that Ashkenazi-associated genetic
disease alleles were already at appreciable frequencies, centuries earlier than previously hypothesized.
These findings provide new insights into a significant historical crime, into Ashkenazi population history, and
into the origins of genetic diseases associated with modern Jewish populations.


INTRODUCTION 


In 2004 construction workers excavating land in central Norwich, 
UK, as part of the Chapelfield shopping center development 
recovered human skeletal elements from their spoil.’ Subse- 
quent archaeological investigations led to the discovery and 
excavation of a probable well containing the commingled re- 
mains of at least seventeen people. The stratigraphic position 
of the remains, their completeness, and state of articulation sug- 
gested that they had all been deposited in a single event shortly 
after their death. The overrepresentation of subadults and the 
unusual location of the burial outside of consecrated ground 
suggested that they may have been victims of a mass fatality 
event such as famine, disease, or mass murder. 

Pottery sherds from the well were dated typologically to 12th- 
14th centuries CE, and two initial radiocarbon determinations on 
the skeletal remains placed these in the 11th-12th centuries. ' 
The most prominent historically attested mass death in Norwich
within this date range was in 1190 CE, when members of the
Jewish community were killed during antisemitic riots precipitated
by the beginning of the Third Crusade,” although the number
of individuals killed is unclear.° Norwich had been the setting
for a previous notable event in the history of medieval antisemit- 
ism when, in 1144 CE, the family of William of Norwich claimed 
that local Jews were responsible for his murder, an argument 
taken up by Thomas of Monmouth through the first documented 
invocation of the blood libel myth. This represents the beginnings 
of an antisemitic conspiracy theory that persists up to the pre- 
sent day.* The possibility that the remains found at the Chapel- 
field well site were those of the victims of antisemitic violence 
is given further support by the site’s location just to the south 
of the medieval Jewish quarter of the city.'*° However, no addi- 
tional archaeological evidence linked the human remains to a 
specific historical event or group of people. During the High Me- 
dieval period (ca. 1000-1300 CE), Norwich witnessed a number 
of outbreaks of large-scale violence,° and additional data were 
therefore required to test the hypothesis that these individuals 
were of Ashkenazi Jewish descent.

Judaism is a shared religious and cultural identity, with endogamous
marriage practices and distinctive diasporic histories of
communities worldwide, particularly a Levantine origin and complex
history of migrations over the last ~2.5 millennia. Present-day
Ashkenazim are descendants of medieval Jewish populations
with histories primarily in northern and eastern Europe. As a result,
they carry distinctive ancestries,’”~'° and Jewish and non-Jewish
medieval individuals living in the same regions would likely show
characteristic patterns of genetic variation. ''

Hereditary disorders in Ashkenazi Jewish populations have 
been the focus of considerable medical research, '* '° with ge- 
netic screening now commonplace to mitigate risks.'’ Their 
prevalence is generally attributed to strong genetic drift during 
Ashkenazi population bottlenecks,'°'®?° coupled with high 
endogamy,”'''** although other processes such as heterozy- 
gote advantage have been proposed.”*:*° Candidate population 
bottlenecks include the phase of dispersion following the 
destruction of the Second Temple in 70 CE, the formation of 
Ashkenazi communities in northern Europe during the medieval 
period, antisemitic persecution arising from the Crusades, un- 
founded reprisals for the Black Death, and the movement from 
western and central Europe to eastern Europe that preceded 
rapid population growth from the 15th to 18th centuries. '9:21:27.2° 

No genomes from known Jewish individuals are currently 
available from the medieval period or earlier, largely because 
exhumation and scientific testing of Jewish remains are pro- 
hibited. Such data could inform on the migration and admixture 
histories of Jewish populations. Furthermore, the presence of 
any pathogenic variants would provide valuable clues to the or- 
igins and spread of Ashkenazim-associated genetic disorders. 
Here, we examine results from radiocarbon dating and genetic 
analyses of the Chapelfield individuals to better establish who 
they were, when they died, and the nature of their death and 
burial, and identify potential broader implications for Ashke- 
nazim population history and genetics. 


RESULTS 


Radiocarbon dating 

The two previously published radiocarbon dates' were supplemented
by three further radiocarbon dates obtained by directly
sampling the human remains (Method details: Radiocarbon
dating). All five radiocarbon dates were consistent with each
other (Acomb = 75.8%), so they were calibrated and modeled as
a single event using the OxCal function Combine.”*°° This indicated
that the bodies were deposited 1161-1216 calCE (calibrated
years CE; 95% confidence) or 1165-1207 calCE
(68% confidence). This date range is consistent
with the only historically attested antisemitic massacre in Norwich,
in 1190 CE (Figure 1). However, this range also encompasses
the so-called Great Revolt of 1174 CE, when many people
were killed during the sack of Norwich by Hugh Bigod.°


Ancient DNA 

NGS libraries from 25 skeletal elements were screened, and nine 
libraries from six individuals were selected for higher coverage 
sequencing on the basis of endogenous DNA content > 4.5% 
(Data S1A). This resulted in autosomal average coverage
(average read depth) per individual ranging from 0.16× to
13.81× and a mean read length of 68 bp from the six individuals
(Data S1B). In addition to established methods for authenticating 
aDNA,°''** we studied DNA molecule degradation using the 
lambda parameter to estimate true fragment length®® and 
compared results with a depositionally varied panel of ancient 
genomes (Method details: DNA fragmentation). We found signif- 
icant variation in DNA fragmentation among the sequenced 
Chapelfield samples, indicating this cannot be predicted by 
depositional history. 


Familial relationships 

We inferred familial relationships among the Chapelfield group 
on the basis of pairwise relatedness coefficients and summary 
statistics (Method details: Inferring familial relationships and 
inbreeding). From this, it was inferred that three individuals 
were full-sibling sisters: SB606 (from Deposit Sk 75), a 10- to 
15-year-old; SB671 (Deposit Sk 78), a young adult; and SB605 
(Deposit Sk 69), a 5- to 10-year-old (Data S1G). These sisters 
were found to share the mitochondrial haplotype H5c2. In addi- 
tion, SB696 appears more distantly related to this group, and 
SB676 is in turn distantly related to SB696. 

Individual SB604 had multiple long runs of homozygosity 
(RoHs) comprising large portions of many chromosomes (Fig- 
ures 2 and S3), and an inbreeding coefficient (0.21) close to 
that expected for the offspring of a first-order union. The very 
long RoHs (up to approximately 40 cM) identified in this individ- 
ual indicate a very recent inbreeding event. Additionally, the pro- 
portions of long RoHs in SB676 and SB605 are consistent with 
their parents having been second-degree relatives (Figure S3). 
We exclude the possibility that the observed RoHs are entirely 
explained by low effective population size, as we would expect 
the distribution of RoH lengths to show an excess of short 
RoHs, which we do not observe (Figure 2). 
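
The expectations referred to in this section can be sketched numerically. The Python below is a minimal illustration, not the method used in this study: it computes the inbreeding coefficient expected for a child whose parents are related at a given degree, and a simple RoH-based estimate (the fraction of the genome lying in runs of homozygosity). The function names, the example RoH lengths, and the 3,400 cM map length are illustrative assumptions.

```python
def expected_inbreeding(parent_degree: int) -> float:
    """Expected inbreeding coefficient F of a child whose parents are
    relatives of the given degree (1 = first degree, 2 = second degree).

    F equals the kinship coefficient of the parents, (1/2) ** (degree + 1).
    """
    if parent_degree < 1:
        raise ValueError("parent_degree must be >= 1")
    return 0.5 ** (parent_degree + 1)


def f_roh(roh_lengths_cm, genome_length_cm):
    """Fraction of the genome covered by runs of homozygosity."""
    if genome_length_cm <= 0:
        raise ValueError("genome_length_cm must be positive")
    return sum(roh_lengths_cm) / genome_length_cm


if __name__ == "__main__":
    print(expected_inbreeding(1))  # parents are first-degree relatives
    print(expected_inbreeding(2))  # parents are second-degree relatives

    # Hypothetical RoH lengths (cM) and an assumed autosomal map length.
    hypothetical_rohs_cm = [40.0, 35.5, 28.0, 22.5]
    print(f_roh(hypothetical_rohs_cm, 3400.0))
```

Under this simplification, first-degree parents give F = 0.25 and second-degree parents give F = 0.125, which is the scale against which the estimate of 0.21 for SB604 is read.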


Genetic ancestry 

Present-day Ashkenazim are primarily descended from dias- 
poras who settled in various parts of northwestern and, later, cen- 
tral and eastern Europe through the medieval period.** As such, 
they represent the present-day population that we would expect 
to be genetically most similar to Jewish individuals in medieval 
England. In addition, historical sources indicate that the Norwich 
Jewish community were descendants of Ashkenazi Jews from 
Rouen, Normandy, who were invited to England by William the 
Conqueror after 1066 CE.°° We therefore explored the genetic af- 
finities of the six Chapelfield individuals using principal compo- 
nent analysis (Figure 3) and tested whether modern Ashkenazi 
ancestry is better explained by the ancestry of the Chapelfield 
assemblage or a mixture of modern populations acting as proxies 
for ancient admixture components (e.g., Middle Eastern and 
Southern and Eastern European). We inferred ancestry propor- 
tions for modern Ashkenazi with qpAdm,°° with Chapelfield, 
Turkish Jews, Sicilian, French, and Polish as potential sources, 
and found the best model to be one of 100% Chapelfield (p = 
0.65; Data S11; by convention values below 0.01 indicate a poor 
fit). We also modeled Chapelfield ancestry as a mixture of modern 
populations, which we use as proxies for hypothesized ancestry 
components: Turkish Jews, Sicilian, French, and Polish. We esti- 
mate a mixture of ~33%, ~67%, ~0%, and ~0%, respectively 
(p = 0.88). These results are consistent with a previous
demographic model,°’ which places the introgression of Eastern
European ancestry after the date of these individuals.

Figure 1. Details of the archaeological and radiocarbon information from the Chapelfield well deposit
(A) Location of the site in Norwich, UK.
(B) West-facing vertical section drawing of the Chapelfield well shaft. '
(C) Probability distribution for the date of deposition of the human remains based on 5 radiocarbon dates modeled as a single event using the Combine function in
OxCal 4.4 and the IntCal20 curve. 1165-1179 CE (31.4% probability), 1190-1207 CE (36.8% probability), and 1161-1216 CE (95.4% probability). χ² test, df = 4,
T = 4.882 (5% 9.488); agreement, n = 5; Acomb = 75.8% (An = 31.6%),°""° from isotopic data in Table S1.
See also Method details: Radiocarbon dating.

Uniparental haplogroup calls for all Chapelfield individuals 
also support genetic affinities to the Near East and in some cases 
to Ashkenazi Jewish populations (Method details: Sex and uni- 
parental haplogroups). Specifically, Ashkenazi Jews form the 
majority of modern carriers for mitochondrial haplogroup 
H5c2,°°:*°-*" in particular the back mutation at 16304, observed 
in the three sisters SB605, SB606, and SB671 (Table S3). The Y 
chromosome haplotype of SB676 (E1b1b1b2a1b1a) is within 
haplogroup E-M34, which is common in Semitic language
speakers and has a frequency of 11.7% among Ashkenazim 
for the parent haplogroup E-M12332. Similarly, the parent sub- 
clades of SB604 (J1a2a1a2d2b2) and SB696 (T1a1a) are partic- 
ularly associated with Levantine ancestry.*7“° 

In order to temporally resolve genomic affinities, we estimated 
coalescence rates using Colate** based on allele ages inferred 
from a genealogy of a diverse set of modern populations.*° In 
panmictic population models, intra-group coalescence rates 
are expected to be inversely related to effective population 
size. Between groups, they can be interpreted as a function of 
the intensity of gene flow prior to the coalescence event. They 
are therefore informative on population size history and demo- 
graphic processes like admixture. 
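
The relationship between coalescence rates and population size can be made concrete with a small sketch. It assumes a constant-size, panmictic diploid population, in which two lineages coalesce at rate 1/(2Ne) per generation, and it averages pairwise rates within and between groups while skipping comparisons of an individual with itself. This is not Colate; the function names and the toy rate matrix are hypothetical.

```python
from itertools import combinations


def ne_from_coalescence_rate(rate_per_generation):
    """Effective size implied by a pairwise coalescence rate, assuming a
    constant-size panmictic diploid population (rate = 1 / (2 * Ne))."""
    if rate_per_generation <= 0:
        raise ValueError("rate must be positive")
    return 1.0 / (2.0 * rate_per_generation)


def average_pairwise_rates(rates, groups):
    """Average pairwise rates within and between groups.

    rates  : square, symmetric matrix (list of lists) of pairwise rates
    groups : group label of each individual, in matrix order
    Returns a dict keyed by a sorted (group, group) tuple.
    Pairs of an individual with itself are never visited.
    """
    sums, counts = {}, {}
    for i, j in combinations(range(len(groups)), 2):
        key = tuple(sorted((groups[i], groups[j])))
        sums[key] = sums.get(key, 0.0) + rates[i][j]
        counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


if __name__ == "__main__":
    # Toy values only: two individuals in group "Ch", one in group "AJ".
    toy_groups = ["Ch", "Ch", "AJ"]
    toy_rates = [
        [0.0, 0.002, 0.001],
        [0.002, 0.0, 0.0015],
        [0.001, 0.0015, 0.0],
    ]
    for pair, rate in average_pairwise_rates(toy_rates, toy_groups).items():
        print(pair, rate, ne_from_coalescence_rate(rate))
```

A higher intra-group average therefore corresponds to a smaller implied effective size under this simple model; real demographic histories violate the constant-size assumption, so the conversion is indicative only.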

In the epoch roughly corresponding to 119 BCE to 1140 CE
(Figure 4A), we find that Europeans are separated from modern 
Ashkenazi Jews and Middle Eastern individuals on the first axis 
of variation, with Chapelfield individuals in between but closer 
to Europeans. The second dimension sets modern Ashkenazi 
Jews apart, and to a lesser extent the Chapelfield individuals. 
Figure 4B summarizes the same pairwise coalescence rates by 
showing inter- and intra-group average pairwise coalescence 
rates (apCRs). Inter-group apCRs mirror the patterns of the first 
dimension of the multidimensional scaling (MDS) plot, while 
intra-group apCRs are highest in modern Ashkenazi Jews, fol- 
lowed by Chapelfield individuals, and can be seen as reflecting 
the variation shown in the second dimension of the MDS plot. 
The Chapelfield individuals have the highest inter-group apCR 
with modern Ashkenazi Jews. The relative ranking of apCRs 
for the older epoch, roughly from 3278 BCE to 119 BCE, shown [...] in Ashkenazi
Jews and Chapelfield individuals.

We interpret these patterns as showing (1) some degree of
population continuity between Chapelfield individuals and modern
Ashkenazi Jews, consistent with the qpAdm results; (2) modern
Ashkenazi Jews being a mixture of Middle Eastern and European
populations, as, for example, found by Xue et al.°’ and
consistent with the qpAdm results; (3) both modern Ashkenazi
Jews and Chapelfield individuals having smaller effective population
sizes than pan-Middle Eastern and pan-European populations,
which is consistent with, though not sufficient to infer, a potential
population bottleneck in Ashkenazim before 1140 CE; and (4)
modern Ashkenazim having experienced additional bottlenecks
or increased endogamy after 1140 CE.


Phenotypes and genetic disorders 

Three individuals have sufficient genotyped SNPs to pass the
threshold for HIrisPlex*® pigmentation phenotype inference
(Method details: Inferring pigmentation phenotypes; Table S4).
Two individuals were inferred to have had brown eyes, one
with “dark” and one with “light” hair (SB605 and SB676, respectively),
while the 0- to 3-year-old boy (SB604) was inferred to
have had blue eyes and red hair, the latter of which is associated
with historical stereotypes of European Jews.*”

We examined the six Chapelfield genomes for variants associ- 
ated with hereditary diseases in Ashkenazi Jews'?:'®4°°° at 159 
loci. Previous studies have attributed the high frequencies of 
certain genetic disease alleles in Ashkenazi Jews principally to 
high rates of drift during population bottlenecks, variously hypoth- 
esized to have occurred around 1,100-1,400 CE,”' ca. 900 CE,'® 
~1,300 CE,** 33 generations ago,’' or 30 generations ago.'° 
Because the Chapelfield individuals lived prior to, or at the start 
of these hypothesized bottlenecks, we would not expect the 
comparatively high frequencies of modern Ashkenazim-associ- 
ated disease alleles to have been reached in the population to 
which they belonged. Rather, we expect the population at this 
time to have disease allele frequencies that are more typical of 
modern European populations. 

Figure 3. Principal component analysis (PCA) exploring the genetic affinities of the six Chapelfield individuals
We projected the six Chapelfield genomes on a PCA defined by variation among modern western Eurasian population samples, including modern
Jewish individuals. All six Chapelfield individuals project well away from present-day British samples, as well as northern Europeans more generally.
Instead, they partially overlap with Southern Europeans, close to Cypriots, modern Ashkenazi, Turkish, and North African Jews. These results are
consistent with the Chapelfield individuals having Jewish ancestry (cf. Kopelman et al.°°).

Figure 4. Pairwise coalescence rates between European, Middle Eastern, modern Ashkenazi, and Chapelfield individuals
(A) Multidimensional scaling (MDS) of pairwise coalescence rates. Rates between the chromosomes of a single individual are not included.
(B and C) Heatmaps of average pairwise coalescence rates (apCRs) between and within groups of individuals. Color code for inter-group comparisons goes from
[...] Eur; modern Ashkenazi Jews, AJ; Chapelfield, Ch.

Current Biology 32, 1-10, October 24, 2022

Please cite this article in press as: Brace et al., Genomes from a medieval mass burial show Ashkenazi-associated hereditary diseases pre-date the
12th century, Current Biology (2022), https://doi.org/10.1016/j.cub.2022.08.036

[Figure 5 plot. Vertical axis: read error rate (0.000 to 0.015). Horizontal axis: disease alleles with a depth greater than zero. Legend: Modern Ashkenazi Jews; Modern European (non-Finnish); Chapelfield.]

Figure 5. Heatmap of simulation results for genetic disorders
Comparison of test statistic (number of disease alleles with a read depth > 0) of observed Chapelfield data with 22,500 simulations under the assumption of
modern Ashkenazi Jewish population allele frequencies, and modern European (non-Finnish) population allele frequencies, across a range of plausible read error
rates. Methods underlying this analysis are described in Method details: Analysis of Ashkenazi-associated Mendelian disorders, and observed read depth per loci
in the Chapelfield samples are shown in Figure S4.

Across the 159 Ashkenazi-associated genetic disorder loci
considered, we amassed a total of 4,755 reads for the six Chapelfield
individuals. Of these, 45 reads from 35 loci were genetic
disorder alleles (one locus had 4 reads, two loci had 3 reads
each, three loci had 2 reads each, and 29 loci had 1 read each).
However, it is important to note that some of these reads will be
type 1 errors, falsely indicating the presence of a disease allele.
Therefore, we simulated datasets assuming both the modern
European (non-Finnish) and modern Ashkenazi population allele
frequencies in the gnomAD database,°’ to explore how many
disease allele reads should be expected in our sample at different
read error rates. We sampled A,C,G,T nucleotides (at each locus,
for each individual) from a multinomial distribution, using the
observed total read depth as the number of trials. To account for
read errors, we introduce a read error parameter ε (assumed constant
across all loci) to adjust the gnomAD allele frequencies, which were
then used as the multinomial probabilities. As a simple test statistic
to compare our observed data with simulations, we use the
total number of disease alleles with one or more reads. Figure 5
illustrates that across a range of plausible error rates (0% to
1.5%), our test statistic for the Chapelfield data is typically expected
given modern Ashkenazi Jewish population allele frequencies
(one-tail test for greater or equal to 35: p = 0.8143),
but unlikely given modern European (non-Finnish) frequencies
(one-tail test for greater or equal to 35, p = 0.0048).
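
The simulation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the study's code: the text does not say exactly how ε adjusts the allele frequencies, so the sketch assumes a symmetric error model in which a base is misread as each of the other three bases with probability ε/3. The data layout (`freqs`, `disease`, `depths`) and all function names are hypothetical.

```python
import random

BASES = "ACGT"


def adjust_for_error(freqs, eps):
    """Error-adjusted nucleotide probabilities (assumed symmetric model):
    a read shows base b if b was sampled and read correctly, or if another
    base was sampled and misread as b (probability eps / 3 each)."""
    return {b: freqs[b] * (1 - eps) + (1 - freqs[b]) * eps / 3 for b in BASES}


def simulate_statistic(loci, eps, rng):
    """One simulated dataset: number of loci at which the disease allele is
    seen in at least one read, pooling reads over individuals.

    Each locus is a dict with keys:
      'freqs'   : population allele frequencies for A, C, G, T
      'disease' : the disease-associated base
      'depths'  : observed read depth for each individual
    """
    hits = 0
    for locus in loci:
        probs = adjust_for_error(locus["freqs"], eps)
        weights = [probs[b] for b in BASES]
        seen = False
        for depth in locus["depths"]:
            if depth == 0:
                continue
            # Multinomial draw with `depth` trials, one read at a time.
            reads = rng.choices(BASES, weights=weights, k=depth)
            if locus["disease"] in reads:
                seen = True
        hits += seen
    return hits


def one_tailed_p(observed, loci, eps, n_sims=1000, seed=0):
    """Share of simulations with a statistic >= the observed value."""
    rng = random.Random(seed)
    sims = [simulate_statistic(loci, eps, rng) for _ in range(n_sims)]
    return sum(s >= observed for s in sims) / n_sims


if __name__ == "__main__":
    # Two made-up loci; real input would be the 159 loci and their depths.
    toy_loci = [
        {"freqs": {"A": 0.0, "C": 0.02, "G": 0.98, "T": 0.0},
         "disease": "C", "depths": [9, 0, 4, 0, 2, 1]},
        {"freqs": {"A": 0.001, "C": 0.0, "G": 0.0, "T": 0.999},
         "disease": "A", "depths": [10, 3, 0, 0, 0, 5]},
    ]
    print(one_tailed_p(observed=1, loci=toy_loci, eps=0.009))
```

Running the same function once with Ashkenazi and once with non-Finnish European frequencies, over a grid of ε values, would give the kind of comparison summarized in Figure 5.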

To explore this further, we formulated a likelihood function to 
calculate the exact probability of the six individuals’ observed 
allele reads at the 159 disease loci, given the allele frequencies 
of any proposed population. The likelihood function utilizes the 
read error parameter, summing the probability of the observed 
data for all 10^6 permutations of the ten possible genotypes at a
single locus. We assume loci are independent; therefore, the
overall probability is simply the product of each locus-specific
probability. This allows us to calculate the likelihood of either
the modern European (non-Finnish) or modern Ashkenazi popu- 
lations, given the data. To further account for uncertainty in the 
allele frequencies of these modern populations, we use the gnomAD*
counts of alleles as parameters in a Dirichlet distribution
(with a uniform prior) to sample across possible population allele
frequencies. Figure 6 illustrates these likelihoods under 5,000
random samples, each with a different read error rate sampled
from a uniform distribution between 0% and 1.5%. The
maximum likelihood read error rate estimates are notably similar
(0.87% and 0.94%, respectively), and crucially these results
show that the data are 4,615 times more probable under a model
in which these individuals were sampled from the modern Ashkenazi
population than under one in which they were sampled from the
modern non-Finnish European population. This approach assumes the
six individuals are randomly sampled from either population. Further
assessment of the effect of this assumption, given that three individuals
are siblings, suggests that in the case of these data our assumption
has a conservative effect on the likelihood ratio (Method details:
Analysis of Ashkenazi-associated Mendelian disorders).
The likelihood ratios (LRs) at the vast majority of loci (155 of 159)
have little influence on this (mean = 0.967, SD = 0.054, range
0.777 to 1.125), since loci with no reads are equally probable
(certain) under either population model, and those with only
non-pathogenic allele reads are marginally more probable under
the European population in which the pathogenic alleles are at a
lower frequency. Instead, the substantial overall difference in
likelihoods is driven by variant NC_000021.9:g.32602299G>C
(rs202094637, 21:33974609G>C) linked to primary ciliary
dyskinesia®° where SB676 has allele reads of A = 0, C = 3, G = 6,
T = 0 (LR = 113.4); variant NC_000007.14:g.83961537G>A
(rs138952094, 7:83590853G>A) reported as linked to delayed
puberty'° where SB605 has allele reads of A = 3, C = 0, G = 6,
T = 0 (LR = 67.8); variant NC_000005.10:g.112839514T>A
(rs1801155, 5:112175211T>A) linked to cancer predisposition '®
where SB676 has allele reads of A = 4, C = 0, G = 0, T = 6
(LR = 48.6); and variant NC_000022.11:g.50528591C>T
(rs188802138, 22:50967020C>T) linked to mitochondrial DNA depletion
syndrome 1'° where SB696 has allele reads of A = 0, C = 0,
G = 0, T = 1 (LR = 2.7). We note that the interpretation of
NC_000007.14:g.83961537G>A as pathogenic has been
recently changed to uncertain on ClinVar,°° but this does not
affect the overall conclusion from our analyses that disease-associated
alleles rose to frequencies similar to that in modern
Ashkenazi populations by the 12th century.

Figure 6. Determining the probability of observed genetic disorder reads in the Chapelfield sample, given different population origins
Log likelihoods compare the probability that the observed data were generated under a model of either the modern Ashkenazi population or the modern European
(non-Finnish) population, using respective population allele frequencies in the gnomAD database. Each population model has its own parameter ε, which determines
the allele read error probability and is assumed constant across all loci. The maximum likelihood ratio indicates the data are 4,615 times more probable
under the Ashkenazi model than the European model.
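
The genotype-summing likelihood used in this section can be sketched as follows. The sketch rests on assumptions that the text does not spell out: genotype priors follow Hardy-Weinberg proportions, each read comes from either chromosome with equal probability, and a misread base is equally likely to be any of the other three. If individuals are treated as independent draws, the sum over all joint genotype combinations factorizes into a product of per-individual sums over the ten genotypes, which is what the code exploits. All names and the two frequency sets are hypothetical; only the read counts for SB676 are taken from the text.

```python
import math
from itertools import combinations_with_replacement

BASES = "ACGT"
# The ten unordered diploid genotypes: AA, AC, AG, AT, CC, CG, CT, GG, GT, TT.
GENOTYPES = list(combinations_with_replacement(BASES, 2))


def genotype_prior(genotype, freqs):
    """Hardy-Weinberg genotype probability (an assumption of this sketch)."""
    a, b = genotype
    return freqs[a] ** 2 if a == b else 2.0 * freqs[a] * freqs[b]


def read_probs(genotype, eps):
    """Probability of reading each base from a genotype, with error eps."""
    probs = {}
    for base in BASES:
        true_share = genotype.count(base) / 2.0
        probs[base] = true_share * (1.0 - eps) + (1.0 - true_share) * eps / 3.0
    return probs


def multinomial_pmf(counts, probs):
    """Multinomial probability of the observed read counts."""
    coef = math.factorial(sum(counts[b] for b in BASES))
    p = 1.0
    for base in BASES:
        coef //= math.factorial(counts[base])
        p *= probs[base] ** counts[base]
    return coef * p


def individual_likelihood(counts, freqs, eps):
    """P(reads of one individual at one locus), summed over genotypes."""
    return sum(
        genotype_prior(g, freqs) * multinomial_pmf(counts, read_probs(g, eps))
        for g in GENOTYPES
    )


def log_likelihood(loci, eps):
    """Sum of log-likelihoods over loci and individuals (requires eps > 0).

    Each locus is a dict with 'freqs' (allele frequencies) and 'reads'
    (one A/C/G/T count dict per individual).
    """
    total = 0.0
    for locus in loci:
        for counts in locus["reads"]:
            total += math.log(individual_likelihood(counts, locus["freqs"], eps))
    return total


if __name__ == "__main__":
    reads = [{"A": 0, "C": 3, "G": 6, "T": 0}]  # SB676 counts from the text
    # Hypothetical frequencies standing in for the two gnomAD populations.
    higher = {"A": 0.0, "C": 0.02, "G": 0.98, "T": 0.0}
    lower = {"A": 0.0, "C": 0.001, "G": 0.999, "T": 0.0}
    ll_hi = log_likelihood([{"freqs": higher, "reads": reads}], eps=0.009)
    ll_lo = log_likelihood([{"freqs": lower, "reads": reads}], eps=0.009)
    print(math.exp(ll_hi - ll_lo))  # likelihood ratio at this single locus
```

An individual with no reads contributes a likelihood of exactly 1 under any frequencies, which matches the observation above that loci without reads do not affect the ratio. Uncertainty in the frequencies could be layered on by repeating the calculation over frequency sets drawn from a Dirichlet distribution, as the text describes.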


DISCUSSION 


We identify the presence of multiple related individuals at the 
Chapelfield site, with ancestry similar to modern Ashkenazi 
Jews, and a combined radiocarbon date of 1161-1216 calCE 
(95% confidence). These findings are consistent with accounts 
of the 1190 CE antisemitic attacks,” involving the targeting of 
households. It is therefore highly probable that the Chapelfield 
remains were those of victims of the 1190 CE riots, despite the 
challenges of associating archaeological sites with specific his- 
torical events. The red hair pigmentation prediction for SB604 
is notable as medieval antisemitic tropes often incorporated an 
association between Jews and red hair.*’ Our results also indi- 
cate that Ashkenazim-associated disease alleles rose to near- 
modern frequencies prior to the 12th century CE. Since the 
majority of these alleles are at comparatively low frequencies 
in Sephardi Jews, the population bottleneck most likely to
have resulted in elevated frequencies is one associated with 
the formation of Ashkenazim communities in northern Europe 
during the early medieval period.


RESOURCE AVAILABILITY 


Lead contact 
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Ian Barnes
(i.barnes@nhm.ac.uk).


Materials availability 
This study did not generate new unique reagents. 


Data and code availability 


• The accession number for the DNA sequences reported in this paper is ENA: PRJEB55223 (https://www.ebi.ac.uk/ena/data/view/PRJEB55223).
• Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.


EXPERIMENTAL MODEL AND SUBJECT DETAILS 


Description of archaeological and historical background to human remains 

In 2004, workers constructing the Chapelfield shopping center (52.626N, 1.292E) in Norwich, Norfolk, UK, recovered human remains.
Subsequent archaeological investigations by NAU Archaeology led to the discovery and excavation of a circular shaft, interpreted to 
have been originally used as a well. Further commingled human remains were found in partial articulation in a single mass burial de- 
posit at the base of the well. The deposit was less than 0.5 m deep and 1 m in diameter, but contained the highly compacted remains 
of at least seventeen people. The well shaft had been truncated by the machine digger, which means that this number is probably an 
underestimate of the number of people originally deposited. Sediment surrounding the remains comprised soils from outside the well, 
suggesting bodies had been rapidly buried by soil after deposition. The well stratigraphy suggested that the well was disused and dry 
at the time the human remains were interred, and probably being used as a midden. Osteological analysis’ identified at least six 
adults and eleven sub-adults. The subadult remains comprised at least one adolescent, two 10-15-year-olds, three 5-10-year- 
olds, three 3-5-year-olds and two children aged 0-3 years. The adult remains included both males and females. This overrepresen- 
tation of subadults suggests a catastrophic profile, where people of all ages had a similar risk of death, and the compaction of the 
remains suggested that they had been deposited in a single event. Patterns of skeletal articulation and completeness indicated that 
the skeletons were originally interred as complete, intact bodies, with little to no delay between their death and deposition. The Chap- 
elfield burials thus appear to represent a catastrophic mortality event: a famine, epidemic or mass killing. In some cases the bones of 
the legs were higher up in the sediment than the skulls from the same skeletons, suggesting bodies had been deposited in the well 
head-first. There were no signs of trauma indicative of any mass killing event, beyond broken ribs that could have been produced 
when the bodies hit the bottom of the well, although it is possible they had been intentionally killed by a method which left no signs 
of trauma on the bones. There were no signs of skeletal trauma of a type commonly found in individuals who attempted to break a fall, 
providing some indication that the people had died before their bodies were deposited in the well. Some bones showed extremely 
localized brown-black or gray-blue discolorations and longitudinal splitting which can indicate exposure to high temperatures, 
although these changes can also be produced by diagenesis and mineral staining. 


Background sampling information 

In 2011, eight skeletal elements were sampled as part of the BBC series ‘History Cold Case’ (Series 2, Episode 3 ‘The Bodies in The 
Well’). DNA was extracted and sections of the mitochondrial genome were targeted through PCR amplification. These methods
and results are not described here, as they have been superseded by newer methods and data. In 2016 we accessed five additional 
skeletal elements, and these were sampled along with resampling of three of the most promising skeletal elements (based on the 
2011 PCR results). DNA was extracted and NGS libraries constructed as described in the methods section below. In 2019 we 
were given permission to sample a further seventeen skeletal elements. DNA extraction and NGS library construction were also car- 
ried out as described below, with two additional libraries constructed for two of the most promising extracts from the 2016 sampling 
effort. 

Ethical considerations 

Rabbinical law prohibits exhumation of Jewish remains for most purposes. However, these remains were not thought to be those of
Jews when excavated and initially analysed. Once the possibility that these individuals might have been Jewish was established, 
subsequent analyses were conducted with the cooperation and support of the Norwich Hebrew Congregation, and with the approval 
of the Office of the Chief Rabbi. The remains were reburied in 2013 at the Jewish cemetery in Norwich. There was a multi-faith reburial 
ceremony to accommodate the uncertainties surrounding the identity of the burials at that time. A commemorative plaque was also 
placed near the site of the well.


METHOD DETAILS


Radiocarbon dating 

Two radiocarbon dates had previously been obtained at the University of Waikato (New Zealand) by Norvic Archaeology and reported 
in Emery et al.' Three previously unreported radiocarbon dates were commissioned by SHINE TV as part of the History Cold Case TV 
Series. The new dates were generated by Accelerator Mass Spectrometry (AMS) at the SUERC AMS facility, from extracted skeletal 
collagen, pretreated and reduced to graphite. All five radiocarbon dates were modelled in OxCal 4.4 using the IntCal20 calibration
curve.’?“° The commingled skeletons were recovered from a single stratigraphic unit and were in partial articulation, suggesting
that they had been buried in a single event soon after death.’ The skeletons therefore meet the assumptions required for their
radiocarbon dates to be modelled with Bayesian inference as representing a single event, using the Combine function in OxCal 4.4 to
produce a refined posterior density estimate. The Combine model produces a good agreement index (Acomb = 75.8%) and posterior
density estimates for the deposition of the bodies of 1161-1216 calCE (95% confidence) or 1165-1207 calCE (68% confidence).
The AMS data generated are included in Table S1, and modelled likelihood distributions for these dating
results are presented in Figure 1C.
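
The consistency check behind a statistic such as T = 4.882 against a 5% critical value of 9.488 (df = 4) can be illustrated with a chi-squared test on an error-weighted mean. The sketch below is not OxCal and does not calibrate anything: it works on uncalibrated ages, uses made-up example ages and errors rather than the values in Table S1, and takes the critical value from the Figure 1 caption. Function and variable names are hypothetical.

```python
def combine_ages(ages_bp, sigmas):
    """Error-weighted mean of radiocarbon ages and a chi-squared statistic.

    ages_bp : uncalibrated ages (years BP)
    sigmas  : their one-sigma measurement errors
    Returns (pooled_age, pooled_sigma, t_statistic); the statistic is
    compared against a chi-squared distribution with n - 1 degrees of freedom.
    """
    if len(ages_bp) != len(sigmas) or not ages_bp:
        raise ValueError("ages_bp and sigmas must be non-empty and equal length")
    weights = [1.0 / s ** 2 for s in sigmas]
    pooled = sum(w * a for w, a in zip(weights, ages_bp)) / sum(weights)
    pooled_sigma = (1.0 / sum(weights)) ** 0.5
    t_stat = sum(((a - pooled) / s) ** 2 for a, s in zip(ages_bp, sigmas))
    return pooled, pooled_sigma, t_stat


# 5% critical value for df = 4, as quoted in the Figure 1 caption.
CHI2_CRITICAL_5PCT_DF4 = 9.488

if __name__ == "__main__":
    # Hypothetical ages and errors, for illustration only.
    ages = [870.0, 850.0, 890.0, 860.0, 880.0]
    errors = [30.0, 30.0, 30.0, 30.0, 30.0]
    pooled, pooled_sigma, t_stat = combine_ages(ages, errors)
    consistent = t_stat < CHI2_CRITICAL_5PCT_DF4
    print(pooled, pooled_sigma, t_stat, consistent)
```

When the statistic falls below the critical value, the dates are statistically indistinguishable and treating them as a single event is defensible; the calibrated date ranges themselves still require a calibration curve such as IntCal20.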


Ancient DNA extraction and sequencing 

Bone powder (27-67 mg) was removed using a Dremel drill at slow speed. DNA was extracted following a modified standard protocol
after Dabney et al.°° The protocol was modified by replacing Zymo-Spin V columns with Roche High Pure Viral Nucleic Acid Large
Volume spin columns, and by using two final elution steps of 50 µl (total 100 µl) TET buffer (10 mM Tris-HCl, 1 mM EDTA, 0.05% Tween-20, pH
8.0). Dual indexed libraries were constructed based on standard double-stranded DNA protocols.*°° Prior to blunt end repair, the
DNA extracts were partially UDG treated: 25 µl DNA extract and 3.6 µl USER enzyme (New England Biolabs) were incubated at 37°C for
30 min, followed immediately by the addition of 3.6 µl UGI enzyme (New England Biolabs) and incubation at 37°C for 30 min. Reaction
purification steps were carried out using MinElute purification kits (Qiagen). The indexing PCR step used AmpliTaq Gold (Fisher Scientific)
DNA polymerase. All pre-PCR steps were carried out in the dedicated aDNA laboratory at the Natural History Museum, London
(NHM). All 27 libraries (Data S1A) were screened on a NextSeq sequencing platform at the NHM using mid output 75 PE (150 cycles)
kits. Nine libraries from six individuals were selected as the most likely to generate unique DNA reads (based on endogenous content
and complexity), and deep sequenced on a NovaSeq 6000 S4 flow cell with v. 1 chemistry, for 200 cycles (Data S1B).


QUANTIFICATION AND STATISTICAL ANALYSIS 


DNA sequence data processing and alignment 

Sequencing data were analyzed with bioinformatics methods that account for the properties of aDNA. Residual adapters were removed 
from both read pairs prior to merging using AdapterRemoval, discarding reads shorter than 30 base pairs and trimming and collapsing 
forward and reverse reads with default parameters. Collapsed reads were aligned against the 1000 Genomes Project reference genome 
(hs37d5) with bwa mem and filtered for a minimum mapping quality of 30 with samtools. BAM files were sorted with sambamba, 
read groups were set with Picard, and files were merged and PCR duplicates marked with sambamba. GATK v3.8 was used for realignment 
around known InDels listed in the Broad ‘1000G phase1’ and ‘Mills and 1000G gold standard’ resource files (provided as part of 
GATK). Additionally, we used a custom file with InDels causing diseases frequent in Jewish populations (Data S1D), from Carmi 
et al. 


Ancient DNA authentication and genotype calling 

We inferred empirical post-mortem damage (PMD) patterns and recalibrated base quality scores with ATLAS. PMD patterns at the 
first and last 50 base pairs of reads are shown in Figure S1 and confirm ancient DNA authenticity. Contamination estimates based 
upon mitochondrial genomes on a per fragment basis using ContamMix (Figure S2) indicate no contamination. After computing 
PMD and recalibration patterns, genotypes were called with ATLAS, generating pseudo-haploid majority-allele calls (ATLAS options 
‘task=call method=majorityBase’) for the sites covered by the 1240k capture array, and diploid Bayesian maximum a posteriori 
calls (ATLAS options ‘task=call method=Bayesian’) for all sites, with the theta prior fixed at 0.001 (ATLAS options ‘prior=theta 
fixedTheta=0.001’) and equal base frequencies (ATLAS option ‘equalBaseFreq’). In both cases two bases were trimmed from the ends 
of the reads. SAMtools mpileup was also used for studying observed disease alleles probabilistically. 


Sex and uniparental haplogroups 
Chromosomal sex was inferred from X and Y chromosomal read ratios following Skoglund et al. (Figure S2). We note that individual 
SB696 did not yield a clear assignment using this method, as the R_y statistic lies minimally but confidently below the threshold for an 
XY call. We therefore computed the R_x statistic presented in Mittnik et al., which confidently classifies SB696 as male (R_x 95% CI 
[0.49, 0.55], Pearson’s r 0.99). 

Mitochondrial and Y-chromosomal haplogroups were inferred with Haplogrep and Yleaf, respectively (see Table S2). Table S3 
provides further details of mitochondrial mutations observed among the Chapelfield individuals (assigned through Haplogrep); this 
supports the association of observed uniparental haplotypes with Ashkenazi ancestry, discussed above. 
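
As a rough illustration of a read-ratio sex assignment of this kind, the Python sketch below computes the proportion of Y reads among reads mapping to X or Y, with a normal-approximation 95% interval. The function name and the cut-offs (0.016 and 0.075) are assumptions for illustration; the cut-offs are values commonly quoted for this method and should be checked against the original publication. The counts in the usage note are invented.

```python
import math

def ry_sex_call(n_x, n_y, xx_max=0.016, xy_min=0.075, z=1.96):
    """Return (R_y, (ci_low, ci_high), call) from X and Y mapped read counts.

    xx_max and xy_min are assumed cut-offs, not values taken from this paper.
    """
    n = n_x + n_y
    if n == 0:
        raise ValueError("no reads mapped to X or Y")
    r_y = n_y / n
    # Binomial standard error of the proportion of Y reads
    se = math.sqrt(r_y * (1.0 - r_y) / n)
    lo, hi = r_y - z * se, r_y + z * se
    if hi < xx_max:
        call = "XX"          # whole interval below the XX cut-off
    elif lo > xy_min:
        call = "XY"          # whole interval above the XY cut-off
    else:
        call = "unassigned"  # interval falls between or straddles the cut-offs
    return r_y, (lo, hi), call
```

With invented counts, `ry_sex_call(9300, 700)` gives an interval lying just under the assumed XY cut-off and is therefore left unassigned, the kind of borderline case in which a second statistic is useful.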




Current Biology 32, 1-10.e1-e6, October 24, 2022 e3 


Please cite this article in press as: Brace et al., Genomes from a medieval mass burial show Ashkenazi-associated hereditary diseases pre-date the 12th century, Current Biology (2022), https://doi.org/10.1016/j.cub.2022.08.036 




DNA fragmentation 

To further assess DNA degradation/fragmentation across our six similarly deposited individuals, we compared these individuals to a 
dataset of 95 archaeological individuals with varied depositional histories (age range ca. 40 to 10579 BP). We used the lambda 
parameter to estimate true fragment length and control for differences in sequencing platforms (NextSeq and NovaSeq). We 
randomly drew subsamples of six individuals 1000 times from our pool of 95 archaeological individuals and plotted the standard 
deviation of lambda for each random subsample onto a density distribution. 95% confidence intervals were calculated and the standard 
deviation from our six samples was added to the plot. 

To assess whether the highly similar depositional histories of the Chapelfield individuals would lead to similar DNA degradation/ 
fragmentation patterns across individuals we used the lambda parameter to estimate true fragment length. When the standard 
deviations of six randomly sampled individuals (from a pool of 95 individuals with varied depositional histories, dated from approximately 
40 to 10579 years BP) were plotted onto a density distribution, the standard deviation from our six Chapelfield individuals fell 
outside the 95% confidence intervals of the distribution (Figure S1B). The result highlights that there is significantly more variation 
in lambda (fragment length) within the six Chapelfield individuals than there is among sets of randomly chosen samples spanning 
the last ~10,000 years. Deposition history therefore does not predict fragmentation of DNA on this timescale. 
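
The resampling comparison described above can be sketched in Python as follows. This is a minimal illustration rather than the analysis code: the lambda values are synthetic placeholders, the function names are hypothetical, and the 95% interval is taken as the central 2.5th to 97.5th percentile range of the resampled standard deviations, which is an assumption about how the interval was formed.

```python
import random
import statistics

def sd_null_distribution(pool, k=6, n_draws=1000, seed=0):
    """Standard deviation of lambda for n_draws random subsamples of size k."""
    rng = random.Random(seed)
    return [statistics.stdev(rng.sample(pool, k)) for _ in range(n_draws)]

def central_interval(values, level=0.95):
    """Empirical central interval (percentile method) of a list of values."""
    ordered = sorted(values)
    tail = (1.0 - level) / 2.0
    last = len(ordered) - 1
    return ordered[int(tail * last)], ordered[int(round((1.0 - tail) * last))]

# Synthetic stand-ins for lambda estimates (NOT the study data)
_rng = random.Random(1)
pool = [_rng.uniform(0.01, 0.05) for _ in range(95)]   # 95 comparison individuals
site = [0.005, 0.02, 0.045, 0.07, 0.09, 0.12]          # six same-site individuals

null_sds = sd_null_distribution(pool)
lo, hi = central_interval(null_sds)
site_sd = statistics.stdev(site)
outside = not (lo <= site_sd <= hi)   # True if the site SD is outside the interval
```

Here `outside` is true because the invented site values were chosen to be more dispersed than the synthetic pool; with real data the same comparison indicates whether same-site individuals are unusually homogeneous or heterogeneous in fragment length.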


Inferring population ancestry 

Exploratory genetic ancestry analysis was performed by PCA using LASER v.2.04 following Hofmanová et al., projecting Chapelfield 
BAMs onto a reference space of modern Eurasian populations. Populations used: Southern European (Italian North/South, Sicilian, 
Spanish/-North, Canary Islander, Maltese, Greek), Basque, Sardinian, Cypriot, Central European (Albanian, Bulgarian, Romanian, 
Hungarian, Croatian, Czech, German, French), Eastern European (Russian, Ukrainian, Belarussian, Polish, Sorb), Mordovian, Baltic and 
Finnish (Estonian, Lithuanian, Finnish), British Isles (English, Orcadian, Scottish, Irish/-Ulster, Shetlander), Scandinavian (Icelandic, 
Norwegian), Caucasian (Georgian, North Ossetian, Abkhasian, Chechen, Adygei, Lezgin, Kumyk, Balkar), West Asian (Turkish, Armenian), 
Iranian/-Bandari, Near Eastern (Palestinian, Druze, Jordanian), Ashkenazi-, Turkish-, and North African Jews (Libyan, Moroccan, 
Tunisian). The projection used pileup files (SAMtools v.1.9) filtered for a minimum mapping quality of 30 and a minimum base quality of 20. 
f4 admixture proportions were computed with qpAdm from ADMIXTOOLS, using default parameters and pseudohaploid calls at the 
positions overlapping with the human origin/Illumina capture, on an unrelated subset of Chapelfield individuals (SB604, SB605, SB676). 
Outgroups for qpAdm were individuals with labels Han, Karitiana, Mbuti, and Papuan and the ancient genomes Russia_MA1_HG.SG, 
Ethiopia_4500BP_published.SG, Belgium_UP_GoyetQ116_1_published, Russia_Ust_Ishim.DG, and Spain_ElMiron, retrieved from the 
Allen Ancient DNA Resource. Pairwise coalescence rates using published genomes (Data S1C) were estimated with Colate. Colate 
input was precomputed from recalibrated BAM files (ATLAS option ‘task=recal’) of the Chapelfield individuals, using the provided 
SGDP ‘half_ne_fixed’ mutation ages (Colate mode ‘make_tmp’). Similarly, we precomputed Colate input from VCF files for the 
SGDP samples used, additionally specifying the 1000 Genomes accessible regions as the target mask. Lastly, we estimated pairwise 
coalescence rates between each pair of samples (Colate mode ‘mut’), with the accessible regions mask for Chapelfield individuals, 
20 bootstraps, and epochs determined by bins ‘2.92,4.52,0.4’. 


Inferring pigmentation phenotypes 

Pigmentation phenotypes were predicted based on the HIrisPlex-S method for the three individuals with sufficient genomic 
coverage (SB604, SB605 and SB676, see Table S4). Where genotypes were missing in the VCF, we identified the corresponding positions 
in the recalibrated BAM files and counted one allele if the allele required by HIrisPlex-S was observed in at least one read (see Data S1H). 
We report only the category with the highest probability if it is above 75%, and both the highest and second highest if the former is 
above 50% and the latter above 25%. See Table S2 in Chaitanya et al. for the authors’ preliminary guide on how to interpret the skin 
pigmentation probability profiles. 
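
The reporting rule above can be sketched in Python as follows. The function name, the example category labels and probabilities, and the choice to return an empty list when neither condition is met are assumptions made for illustration; they are not part of the HIrisPlex-S tool.

```python
def report_categories(probs, single=0.75, first=0.50, second=0.25):
    """Apply the reporting thresholds to a dict of {category: probability}."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    top, p_top = ranked[0]
    if p_top > single:
        # One clearly dominant category: report it alone
        return [top]
    if len(ranked) > 1 and p_top > first and ranked[1][1] > second:
        # Moderately supported top category with a credible runner-up
        return [top, ranked[1][0]]
    # Assumption: nothing is reported when neither condition holds
    return []
```

For example, `report_categories({"red": 0.55, "blond": 0.30, "brown": 0.15})` returns `["red", "blond"]`, whereas a profile with a top probability of 0.80 returns that single category.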


Inferring familial relationships and inbreeding 
To determine biological familial relationships among the six sequenced Chapelfield individuals we used the KING-robust method 
implemented in NgsRelate v.2 to estimate R0, R1 and KING-robust kinship statistics. This approach is intended for sequencing 
data of such low coverage that accurate genotypes cannot necessarily be called. 

To study parental relatedness among the sequenced Chapelfield individuals we first studied runs of homozygosity (ROH) using 
pseudohaploid data on the basis of a modern phased haplotype reference panel. Inbreeding coefficients were additionally calculated on 
the basis of the fraction of the genome estimated to be within homozygous-by-descent segments using the sliding window approach 
implemented in PLINK v1.09. Initially a MAF filter of 0.05 was applied, then the data were pruned for linkage disequilibrium (command: 
PLINK --indep 50 2 2), before a sliding window of 50 SNPs was applied (command: PLINK --homozyg --homozyg-window-het 0 
--homozyg-snp 50 --homozyg-kb 1 --homozyg-density 5000 --homozyg-gap 5000). Detected ROH lengths (in cM) greater than the 
threshold above were then used to calculate inbreeding coefficients (Table S5). However, only SB604 had sufficient coverage for 
the inbreeding coefficient to be confidently calculated using this approach. ROHs were further inferred using a panel of reference 
haplotypes with hapROH, and with ROHan for the single individual with sufficient coverage (SB604). 




Analysis of Ashkenazi-associated Mendelian disorders 
To investigate Ashkenazi-associated Mendelian disorders among the Chapelfield individuals we first collated a dataset of 178 SNVs 
interpreted as associated with disorders observed in Ashkenazi populations (Data S1E, with sources detailed) based on published 
data. InDel variants were detected by realignment (see above). We considered allele frequencies for these variants in gnomAD 
and retained 159 loci where the population allele frequency of the disease-associated variant was greater in modern Ashkenazi 
Jewish (ASJ) than in modern non-Finnish European (NFE) populations (Data S1F). We considered genotypes for these loci probabilistically, 
introducing a read error parameter α, defining the probability that a single allele is incorrectly read as one of the other three 
nucleotides. To determine the expected number of observed disease alleles at different rates of read error, datasets assuming ASJ 
and NFE population allele frequencies were simulated by sampling A,C,G,T nucleotides at each locus, for each individual, from a 
multinomial distribution, using the observed total read depth as the number of trials. To calculate the exact probability of the observed 
allele reads, we applied a likelihood function utilizing α, summing the probability of the observed data over all 10^6 permutations of the 
ten possible genotypes at each locus. 
Probabilistic inference of genotypes 
In order to assess the frequency of disease alleles in the Chapelfield individuals, we needed to address two key problems associated 
with ancient DNA data. Firstly, read errors were likely to be present, such as the observation of nucleotide T at position 11:71146886 
for individual SB604, which is not present in any modern population in the gnomAD database. Secondly, read depths were low, varying 
from 16.1 reads per locus for SB604 to only 0.17 reads per locus for SB606, with zero reads at 39.6% of loci when considering all six 
individuals separately (see Figure S4). We addressed these problems by considering genotypes probabilistically (rather than making 
categorical calls) and introducing a read error parameter α, defining the probability that a single allele is incorrectly read as one of the 
other three nucleotides. This value is used globally (the same value for each individual and at each locus), and we assume symmetry 
between nucleotides, such that the probability of A being incorrectly read as C is the same as for each of the other eleven possible 
misreads. For example, we assume the probability of a true T being read as G is α/3. 
Simulating allele reads 
A single simulated dataset was generated in a three-stage process. Firstly, the gnomAD allele counts of A,C,G,T (from a proposed 
population at a specific locus) were used as shape parameters in the Dirichlet distribution (plus one additional count for each 
nucleotide, as a uniform prior), to generate a single set of four allele frequencies. Secondly, these proposed allele frequencies were 
modified by the proposed read error rate α, according to the formulas D1, where: freqs = a vector of the proposed frequencies of A,C,G,T 
at a locus (summing to 1); error = the proposed read error rate α. Thirdly, allele counts were randomly sampled from the multinomial 
distribution, where the total observed counts (across all four nucleotides) were used as the ‘number of trials’ parameter, and the allele 
frequencies (modified by α) were used as the multinomial probabilities. 

Formulas D1 (in R code): 

A <- freqs[1]*(1-error) + sum(freqs[c(2,3,4)])*error/3 

C <- freqs[2]*(1-error) + sum(freqs[c(1,3,4)])*error/3 

G <- freqs[3]*(1-error) + sum(freqs[c(1,2,4)])*error/3 

T <- freqs[4]*(1-error) + sum(freqs[c(1,2,3)])*error/3 
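
The three simulation stages can be sketched end to end as below. This is an illustrative Python transcription, not the R code used in the study: it relies only on the standard library, the function names are hypothetical, and the allele counts in the usage note are invented rather than taken from gnomAD.

```python
import random

def dirichlet(allele_counts, rng):
    """Stage 1: draw allele frequencies from Dirichlet(counts + 1)."""
    draws = [rng.gammavariate(c + 1.0, 1.0) for c in allele_counts]
    total = sum(draws)
    return [d / total for d in draws]

def apply_read_error(freqs, error):
    """Stage 2: formulas D1, the frequency at which each nucleotide is read."""
    total = sum(freqs)
    return [f * (1.0 - error) + (total - f) * error / 3.0 for f in freqs]

def simulate_reads(allele_counts, depth, error, rng):
    """Stage 3: sample `depth` reads of A,C,G,T from the error-modified frequencies."""
    probs = apply_read_error(dirichlet(allele_counts, rng), error)
    reads = [0, 0, 0, 0]
    for i in rng.choices(range(4), weights=probs, k=depth):
        reads[i] += 1
    return reads
```

For instance, `simulate_reads([1000, 0, 5, 0], depth=16, error=0.01, rng=random.Random(42))` returns four read counts for A, C, G and T that sum to 16. Repeating the call over loci and individuals, once with each population's allele counts, gives the simulated numbers of observed disease alleles under each population model.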
Likelihoods and likelihood ratios 
Likelihoods were calculated using a four-stage process that utilized the observed allele read counts, proposed population allele 
frequencies, and the read error rate α. Firstly, for a single locus, we generated all 1,000,000 permutations of the six individuals’ ten 
possible genotypes (AA, AC, AG, AT, CC, CG, CT, GG, GT, TT), and calculated the frequency of each genotype permutation, given 
the gnomAD population allele frequencies and assuming Hardy-Weinberg equilibrium. Where gnomAD data provided counts for 
exomes and genomes we used the combination (sum of counts) of both. Secondly, we calculated the likelihood of each individual’s 
ten genotypes (again at a single locus), using a proposed read error rate α and the observed allele counts in the multinomial 
distribution as specified in formulas M1. Thirdly, we summed these likelihoods over all 1,000,000 permutations, weighted by the frequency 
of each genotype permutation (since each permutation is a possible explanation of the observed data). Fourthly, we repeated this for 
each of the 159 loci, with the α parameter fixed across all loci, and the overall product (under the assumption that loci are independent) 
provided the exact probability of the observed data, under a model of the gnomAD allele frequencies and a single α parameter. 
This approach deliberately avoids making any categorical genotype calls, and instead maintains probabilistic genotypes for 
downstream calculations. This is of particular value when analyzing aDNA, where allele read depths are typically low and read errors high. In 
comparison, data with high read coverage and low read error rates can be assigned genotypes with such high confidence that the 
computational cost of this permutational method is not justified. Note that, for computational efficiency, where two of the four possible 
nucleotides have a zero count, these can be aggregated into a single ‘other’ category requiring only 46,656 permutations of six 
genotypes (V1/V1, V1/V2, V1/V3, V2/V2, V2/V3, V3/V3), see formulas M2, and similarly where three nucleotides have zero counts, only 
729 permutations of three genotypes need calculating (V1/V1, V1/V2 and V2/V2), see formulas M3. Where all four nucleotides have 
zero counts there is no information, and the likelihood equals 1. 

Formulas (in R code): 

p1 <- 1-error 

p2 <- error/3 

p3 <- 0.5 - p2 
Formulas M1: if 4 nucleotides have counts, all 10 genotypes need to be considered. Likewise, if only 3 nucleotides have counts, the 
fourth remains a possibility if produced by a read error. 
V1.V1 <- dmultinom(counts, prob=c(p1,p2,p2,p2)) 
V1.V2 <- dmultinom(counts, prob=c(p3,p3,p2,p2)) 
V1.V3 <- dmultinom(counts, prob=c(p3,p2,p3,p2)) 
V1.V4 <- dmultinom(counts, prob=c(p3,p2,p2,p3)) 
V2.V2 <- dmultinom(counts, prob=c(p2,p1,p2,p2)) 
V2.V3 <- dmultinom(counts, prob=c(p2,p3,p3,p2)) 
V2.V4 <- dmultinom(counts, prob=c(p2,p3,p2,p3)) 
V3.V3 <- dmultinom(counts, prob=c(p2,p2,p1,p2)) 
V3.V4 <- dmultinom(counts, prob=c(p2,p2,p3,p3)) 
V4.V4 <- dmultinom(counts, prob=c(p2,p2,p2,p1)) 
Formulas M2: if only 2 nucleotides have counts, the remaining two can be combined into a single ‘other’, so that 6 genotypes need 
to be considered: 

V1.V1 <- dmultinom(counts, prob=c(p1,p2,p2+p2)) 

V1.V2 <- dmultinom(counts, prob=c(p3,p3,p2+p2)) 

V1.V3 <- dmultinom(counts, prob=c(p3,p2,p3+p2)) 

V2.V2 <- dmultinom(counts, prob=c(p2,p1,p2+p2)) 

V2.V3 <- dmultinom(counts, prob=c(p2,p3,p3+p2)) 

V3.V3 <- dmultinom(counts, prob=c(p2,p2,p1+p2)) 

Formulas M3 if only 1 nucleotide has counts, the remaining 3 can be combined into a single ‘other’, so that only 3 genotypes need 
to be considered: 

V1.V1 <- dmultinom(counts, prob=c(p1,p2+p2+p2)) 

V1.V2 <- dmultinom(counts, prob=c(p3,p3+p2+p2)) 

V2.V2 <- dmultinom(counts, prob=c(p2,p1+p2+p2)) 
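
The per-locus likelihood built from the M1 formulas can be sketched in Python as below (standard library only; the function names and the read counts in the usage note are hypothetical). The sketch assumes that the frequency of a genotype permutation is the product of the individuals' Hardy-Weinberg genotype frequencies. Under that assumption the weighted sum over all permutations equals the product of per-individual sums, so the brute-force enumeration and the factorised form should agree; the factorised form is offered here as an equivalent shortcut, not as the procedure used in the study.

```python
import math
from itertools import combinations_with_replacement, product

# The ten unordered genotypes over nucleotide indices 0..3 (A, C, G, T)
GENOTYPES = list(combinations_with_replacement(range(4), 2))

def multinomial_pmf(counts, probs):
    """Probability of the observed read counts (equivalent to R's dmultinom)."""
    coef = math.factorial(sum(counts))
    for c in counts:
        coef //= math.factorial(c)
    p = float(coef)
    for c, q in zip(counts, probs):
        p *= q ** c
    return p

def read_probs(genotype, error):
    """Formulas M1: probability of reading A, C, G, T given a true genotype."""
    p1, p2 = 1.0 - error, error / 3.0
    p3 = 0.5 - p2
    a, b = genotype
    if a == b:
        return [p1 if i == a else p2 for i in range(4)]
    return [p3 if i in (a, b) else p2 for i in range(4)]

def hwe_freq(genotype, freqs):
    """Hardy-Weinberg genotype frequency from population allele frequencies."""
    a, b = genotype
    return freqs[a] ** 2 if a == b else 2.0 * freqs[a] * freqs[b]

def individual_likelihood(counts, freqs, error):
    """Sum over the ten genotypes of genotype frequency times read likelihood."""
    return sum(hwe_freq(g, freqs) * multinomial_pmf(counts, read_probs(g, error))
               for g in GENOTYPES)

def locus_likelihood(counts_per_individual, freqs, error):
    """Factorised form: product of per-individual sums."""
    like = 1.0
    for counts in counts_per_individual:
        like *= individual_likelihood(counts, freqs, error)
    return like

def locus_likelihood_bruteforce(counts_per_individual, freqs, error):
    """Explicit weighted sum over every permutation of genotypes."""
    total = 0.0
    for combo in product(GENOTYPES, repeat=len(counts_per_individual)):
        term = 1.0
        for g, counts in zip(combo, counts_per_individual):
            term *= hwe_freq(g, freqs) * multinomial_pmf(counts, read_probs(g, error))
        total += term
    return total
```

With invented counts for three individuals, `locus_likelihood([[3,0,1,0],[0,2,0,0],[0,0,0,0]], [0.9,0.05,0.04,0.01], 0.01)` matches the brute-force value, and an individual with no reads contributes a factor of 1. A likelihood ratio between two proposed populations is then the product over loci of the ratio of these per-locus likelihoods evaluated with each population's allele frequencies.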

Our method calculates likelihoods under the assumption that the six individuals are randomly sampled from a proposed 
population, and therefore does not take relatedness into account. For these particular data, this assumption has a conservative 
influence on the likelihood ratio, for the following reason. The overwhelming majority of the likelihood ratio is driven by variants that are 
private to a single individual (SB676 21-33974609-G-C LR=113.4; SB605 7-83590853-G-A LR=67.8; SB676 5-112175211-T-A 
LR=48.6; SB696 22-50967020-C-T LR=2.7), which removes any influence of relatedness on the likelihood ratios. Four further 
non-private disease alleles were observed in SB604 and SB676 (14-94770808-C-T LR=0.954; 14-97342370-C-T LR=1.079; 
21-43808633-C-A LR=0.863; 21-45713715-C-T LR=0.870), but since the likelihood ratios at these loci overall slightly favor the European 
population (less than 1), adjusting for relatedness would have the effect of slightly increasing the likelihood ratio. In any case, our 
familial relationship analysis did not find a close relationship between SB604 and SB676 that would justify such an adjustment. In 
contrast, the closest relationships identified were between siblings SB605, SB606 and SB671, who had no disease alleles in common. 




